Introduction

This analysis applies network analysis techniques to diplomatic representation data in order to study international status. We use the Diplometrics dataset, which covers diplomatic representation from 1960 to 2020 and shows how status conferred through recognition has evolved over time.

Data Preparation

First, we’ll load the Diplometrics dataset:

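# Load the packages used throughout this analysis
library(readxl)   # read_excel()
library(dplyr)    # data manipulation verbs and %>%
library(igraph)   # network construction and centrality measures
library(ggplot2)  # plots
library(knitr)    # kable()

# Load the Diplometrics data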
diplometrics <- read_excel("/Users/yutianyi/Desktop/MA thesis data creation/Diplometrics Diplomatic Representation 1960-2020_20211215.xlsx", 
                          sheet = "Data")

# Examine the data structure
str(diplometrics)
## tibble [413,582 × 12] (S3: tbl_df/tbl/data.frame)
##  $ Destination          : chr [1:413582] "Afghanistan" "Afghanistan" "Afghanistan" "Afghanistan" ...
##  $ Destination Region   : chr [1:413582] "Asia" "Asia" "Asia" "Asia" ...
##  $ Destination Subregion: chr [1:413582] "Southern Asia" "Southern Asia" "Southern Asia" "Southern Asia" ...
##  $ Sending Country      : chr [1:413582] "Czechoslovakia" "Egypt" "France" "Germany" ...
##  $ Sending Region       : chr [1:413582] "Europe" "Africa" "Europe" "Europe" ...
##  $ Sending Subregion    : chr [1:413582] "Eastern Europe" "Northern Africa" "Western Europe" "Western Europe" ...
##  $ Year                 : num [1:413582] 1960 1960 1960 1960 1960 1960 1960 1960 1960 1960 ...
##  $ Location             : chr [1:413582] "Afghanistan" "Afghanistan" "Afghanistan" "Afghanistan" ...
##  $ Embassy              : num [1:413582] 5 6 6 6 6 4 6 5 6 6 ...
##  $ Focus                : num [1:413582] 1 1 1 1 1 1 1 1 1 1 ...
##  $ EmbassyFocus         : num [1:413582] 51 61 61 61 61 41 61 51 61 61 ...
##  $ LOR                  : num [1:413582] 0.75 1 1 1 1 0.75 1 0.75 1 1 ...
head(diplometrics)
## # A tibble: 6 × 12
##   Destination `Destination Region` `Destination Subregion` `Sending Country`
##   <chr>       <chr>                <chr>                   <chr>            
## 1 Afghanistan Asia                 Southern Asia           Czechoslovakia   
## 2 Afghanistan Asia                 Southern Asia           Egypt            
## 3 Afghanistan Asia                 Southern Asia           France           
## 4 Afghanistan Asia                 Southern Asia           Germany          
## 5 Afghanistan Asia                 Southern Asia           India            
## 6 Afghanistan Asia                 Southern Asia           Indonesia        
## # ℹ 8 more variables: `Sending Region` <chr>, `Sending Subregion` <chr>,
## #   Year <dbl>, Location <chr>, Embassy <dbl>, Focus <dbl>, EmbassyFocus <dbl>,
## #   LOR <dbl>

The Diplometrics dataset contains rich information about diplomatic representation:

  • Destination: Country receiving diplomatic representation
  • Sending Country: Country establishing diplomatic representation
  • Year: Year of observation
  • Embassy: Level of embassy (numeric code; values 1 to 6 in this file)
  • LOR: Level of Representation (standardized measure ranging from 0 to 1)
  • Regional classifications for both sending and receiving countries

# Basic summary of the dataset
cat("Year range:", min(diplometrics$Year), "to", max(diplometrics$Year), "\n")
## Year range: 1960 to 2020
cat("Number of unique sending countries:", length(unique(diplometrics$`Sending Country`)), "\n")
## Number of unique sending countries: 209
cat("Number of unique destination countries:", length(unique(diplometrics$Destination)), "\n")
## Number of unique destination countries: 205
# Check the distribution of Embassy and LOR values
cat("\nDistribution of Embassy values:\n")
## 
## Distribution of Embassy values:
table(diplometrics$Embassy)
## 
##      1      2      3      4      5      6 
##  24136    327    216  26159   2506 360238
cat("\nDistribution of LOR values:\n")
## 
## Distribution of LOR values:
table(diplometrics$LOR)
## 
##      0    0.1  0.125  0.375    0.5   0.75      1 
##   1209    320    212    985  13887  51352 345617

Filtering Data by Year

We filter the data to benchmark years at five-year intervals:

# Select benchmark years
benchmark_years <- c(1960, 1965, 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010, 2015, 2020)

# Function to filter data for a specific year
get_year_data <- function(year_val) {
  year_data <- diplometrics %>%
    filter(Year == year_val)
  
  cat("Year:", year_val, "\n")
  cat("  Number of diplomatic relationships:", nrow(year_data), "\n")
  cat("  Number of sending countries:", length(unique(year_data$`Sending Country`)), "\n")
  cat("  Number of destination countries:", length(unique(year_data$Destination)), "\n\n")
  
  return(year_data)
}

# Filter data for each benchmark year
year_datasets <- lapply(benchmark_years, get_year_data)
## Year: 1960 
##   Number of diplomatic relationships: 2762 
##   Number of sending countries: 94 
##   Number of destination countries: 97 
## 
## Year: 1965 
##   Number of diplomatic relationships: 3877 
##   Number of sending countries: 127 
##   Number of destination countries: 120 
## 
## Year: 1970 
##   Number of diplomatic relationships: 4566 
##   Number of sending countries: 136 
##   Number of destination countries: 134 
## 
## Year: 1975 
##   Number of diplomatic relationships: 5509 
##   Number of sending countries: 147 
##   Number of destination countries: 144 
## 
## Year: 1980 
##   Number of diplomatic relationships: 6161 
##   Number of sending countries: 159 
##   Number of destination countries: 159 
## 
## Year: 1985 
##   Number of diplomatic relationships: 6558 
##   Number of sending countries: 167 
##   Number of destination countries: 162 
## 
## Year: 1990 
##   Number of diplomatic relationships: 6827 
##   Number of sending countries: 170 
##   Number of destination countries: 167 
## 
## Year: 1995 
##   Number of diplomatic relationships: 7383 
##   Number of sending countries: 190 
##   Number of destination countries: 190 
## 
## Year: 2000 
##   Number of diplomatic relationships: 7785 
##   Number of sending countries: 192 
##   Number of destination countries: 190 
## 
## Year: 2005 
##   Number of diplomatic relationships: 8273 
##   Number of sending countries: 193 
##   Number of destination countries: 191 
## 
## Year: 2010 
##   Number of diplomatic relationships: 8859 
##   Number of sending countries: 196 
##   Number of destination countries: 195 
## 
## Year: 2015 
##   Number of diplomatic relationships: 9275 
##   Number of sending countries: 198 
##   Number of destination countries: 195 
## 
## Year: 2020 
##   Number of diplomatic relationships: 9897 
##   Number of sending countries: 200 
##   Number of destination countries: 195
names(year_datasets) <- as.character(benchmark_years)

Creating Diplomatic Networks

Now we’ll create network objects for each benchmark year:

# Function to create a diplomatic network from year data
create_diplomatic_network <- function(year_data) {
  # Create edge list with sender -> destination
  edge_list <- year_data %>%
    select(`Sending Country`, Destination, LOR)
  
  # Create the network
  g <- graph_from_data_frame(edge_list, directed = TRUE)
  
  # Set edge weights based on LOR values
  E(g)$weight <- edge_list$LOR
  
  return(g)
}

# Create networks for each benchmark year
diplomatic_networks <- lapply(year_datasets, create_diplomatic_network)

# Calculate basic network metrics
network_metrics <- data.frame(
  Year = benchmark_years,
  Nodes = sapply(diplomatic_networks, vcount),
  Edges = sapply(diplomatic_networks, ecount),
  Density = sapply(diplomatic_networks, edge_density),
  Reciprocity = sapply(diplomatic_networks, reciprocity),
  Average_LOR = sapply(diplomatic_networks, function(g) mean(E(g)$weight, na.rm = TRUE)),
  row.names = NULL  # Drop the year names inherited from the list so kable() prints no extra column
)

# Display network metrics
kable(network_metrics, digits = 4,
      caption = "Basic Network Metrics by Year")
Basic Network Metrics by Year
Year  Nodes  Edges  Density  Reciprocity  Average_LOR
1960    100   2762   0.2790       0.8262       0.9036
1965    129   3877   0.2348       0.7934       0.7514
1970    140   4566   0.2346       0.7950       0.9044
1975    150   5509   0.2465       0.8132       0.9194
1980    163   6161   0.2333       0.8197       0.8440
1985    170   6558   0.2283       0.8231       0.9731
1990    173   6827   0.2294       0.8335       0.9732
1995    193   7383   0.1992       0.8070       0.9698
2000    194   7785   0.2079       0.8108       0.9671
2005    195   8273   0.2187       0.8285       0.9781
2010    197   8859   0.2294       0.8443       0.9811
2015    198   9275   0.2378       0.8412       0.9833
2020    200   9897   0.2487       0.8465       0.9775
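
As a rough check on what the Density and Reciprocity columns measure, the following sketch builds a small directed network by hand. The countries A, B, C and their LOR values are invented for illustration and are not taken from Diplometrics; only the igraph calls mirror the ones used above.

```r
library(igraph)

# Hypothetical toy edge list: sender -> destination with an LOR weight
toy_edges <- data.frame(
  from = c("A", "B", "A", "C"),
  to   = c("B", "A", "C", "B"),
  LOR  = c(1, 0.75, 1, 0.5)
)
toy_g <- graph_from_data_frame(toy_edges, directed = TRUE)
E(toy_g)$weight <- toy_edges$LOR

# Density: observed edges divided by possible directed edges, n * (n - 1)
toy_density <- edge_density(toy_g)

# Reciprocity: share of edges whose reverse edge is also present
toy_reciprocity <- reciprocity(toy_g)

c(density = toy_density, reciprocity = toy_reciprocity)
```

Here four of the six possible directed ties exist, and only the A-B pair is mutual, so two of the four edges are reciprocated. Neither measure uses the LOR weights.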

Calculating Status Metrics from Network Position

Let’s calculate network centrality measures to capture different dimensions of status:

# Function to calculate status metrics
calculate_status_metrics <- function(graph) {
  # Basic degree centrality (in-degree = recognition received)
  recognition_count <- degree(graph, mode = "in")
  
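  # Weighted in-degree (recognition weighted by LOR)
  # strength(mode = "in") sums the weights of incoming edges; row sums of the
  # adjacency matrix would instead give outgoing (sender-side) totals
  weighted_recognition <- strength(graph, mode = "in", weights = E(graph)$weight)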
  
  # Eigenvector centrality (recognition from important countries)
  eigen <- tryCatch({
    eigen_centrality(graph, directed = TRUE, weights = E(graph)$weight)$vector
  }, error = function(e) {
    # Fallback if weighted calculation fails
    eigen_centrality(graph, directed = TRUE)$vector
  })
  
  # PageRank (prestige-weighted recognition)
  pagerank <- tryCatch({
    page_rank(graph, weights = E(graph)$weight)$vector
  }, error = function(e) {
    # Fallback if weighted calculation fails
    page_rank(graph)$vector
  })
  
  # Authority scores (being recognized by prestigious countries)
  hits_result <- tryCatch({
    hits_scores(graph, weights = E(graph)$weight)
  }, error = function(e) {
    # Fallback if weighted calculation fails
    hits_scores(graph)
  })
  authority <- hits_result$authority
  
  # Betweenness centrality (diplomatic brokerage)
  # weights = NA ignores the LOR weights while keeping the original vertex order;
  # rebuilding the graph with graph_from_edgelist() would reorder the vertices and
  # misalign these scores with the other metrics
  betweenness <- tryCatch({
    betweenness(graph, directed = TRUE, weights = NA, normalized = TRUE)
  }, error = function(e) {
    # If normalization fails, try without normalization
    betweenness(graph, directed = TRUE, weights = NA, normalized = FALSE)
  })
  
  # Create data frame with results
  data.frame(
    country = names(recognition_count),
    recognition_count = recognition_count,
    weighted_recognition = weighted_recognition,
    eigenvector_centrality = eigen,
    pagerank = pagerank,
    authority = authority,
    betweenness = betweenness,
    recognition_rate = recognition_count / (vcount(graph) - 1),  # Share (0-1) of possible recognitions
    row.names = NULL  # Avoid repeating country names as row names in printed tables
  )
}

# Calculate status metrics for each year
status_metrics <- lapply(diplomatic_networks, calculate_status_metrics)
names(status_metrics) <- as.character(benchmark_years)
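
Because the direction of a tie matters for every metric above, it can help to confirm igraph's adjacency convention on a toy case. The sketch below uses three invented countries (A, B, C) and made-up LOR values; it is an illustration under those assumptions, not part of the original pipeline.

```r
library(igraph)

# Hypothetical toy network: A and B both send to C; C sends to A
toy_edges <- data.frame(
  from = c("A", "B", "C"),
  to   = c("C", "C", "A"),
  LOR  = c(1, 0.75, 0.5)
)
toy_g <- graph_from_data_frame(toy_edges, directed = TRUE)
E(toy_g)$weight <- toy_edges$LOR

# Adjacency matrix: entry [i, j] holds the weight of the edge i -> j
adj <- as_adjacency_matrix(toy_g, attr = "weight", sparse = FALSE)

in_strength  <- strength(toy_g, mode = "in")   # recognition received
out_strength <- strength(toy_g, mode = "out")  # recognition given

# Column sums match in-strength; row sums match out-strength
all.equal(unname(colSums(adj)), unname(in_strength))
all.equal(unname(rowSums(adj)), unname(out_strength))

in_strength[["C"]]   # 1 + 0.75 received from A and B
out_strength[["C"]]  # 0.5 sent to A
```

Country C receives 1.75 in weighted recognition but sends only 0.5, so swapping rows and columns would reverse its apparent status.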

Status Rankings and Evolution

Let’s examine status rankings and how they’ve evolved:

# Function to get top-ranked countries
get_top_ranked <- function(metrics, measure, n = 15) {
  metrics %>%
    arrange(desc(!!sym(measure))) %>%
    select(country, !!sym(measure)) %>%
    head(n)
}

# Get top 15 countries by PageRank (prestige-weighted status) for 2020
pagerank_top_2020 <- get_top_ranked(status_metrics[["2020"]], "pagerank")
kable(pagerank_top_2020,
      caption = "Top 15 Countries by PageRank Status (2020)",
      col.names = c("Country", "PageRank Score"),
      digits = 5)
Top 15 Countries by PageRank Status (2020)
Country         PageRank Score
United States          0.02000
Belgium                0.01802
China                  0.01709
United Kingdom         0.01678
Japan                  0.01463
Germany                0.01446
France                 0.01444
India                  0.01424
Russia                 0.01379
Switzerland            0.01299
South Africa           0.01281
Turkey                 0.01262
Italy                  0.01224
Brazil                 0.01198
Canada                 0.01195
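# Get top 15 countries by Weighted Recognition for 2020
# Caution: the scores printed below track LOR-weighted outgoing ties (Cuba scores 122.75
# with 123 outgoing but only 100 incoming ties), i.e. row sums of the adjacency matrix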
weighted_top_2020 <- get_top_ranked(status_metrics[["2020"]], "weighted_recognition")
kable(weighted_top_2020,
      caption = "Top 15 Countries by Weighted Recognition (2020)",
      col.names = c("Country", "Weighted Recognition Score"),
      digits = 2)
Top 15 Countries by Weighted Recognition (2020)
Country             Weighted Recognition Score
China                                   172.50
France                                  161.25
United Kingdom                          159.25
Germany                                 153.00
United States                           152.88
Japan                                   151.75
Russia                                  142.50
Turkey                                  135.50
Brazil                                  133.25
India                                   131.00
Italy                                   126.25
Egypt                                   124.25
Cuba                                    122.75
Spain                                   117.75
Korea, Republic of                      117.50
# Get top 15 countries by Recognition Count for 2020
count_top_2020 <- get_top_ranked(status_metrics[["2020"]], "recognition_count")
kable(count_top_2020,
      caption = "Top 15 Countries by Recognition Count (2020)",
      col.names = c("Country", "Recognition Count"))
Top 15 Countries by Recognition Count (2020)
Country         Recognition Count
United States                 188
Belgium                       183
China                         170
United Kingdom                166
Germany                       160
France                        155
Japan                         153
India                         150
Russia                        148
Switzerland                   143
Italy                         137
South Africa                  133
Turkey                        131
Canada                        130
Egypt                         128
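# Get top 15 diplomatic brokers for 2020
# Caution: betweenness scores only line up with country names if they are computed on a graph
# with the same vertex order as the other metrics; treat the ranking below with care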
brokers_top_2020 <- get_top_ranked(status_metrics[["2020"]], "betweenness")
kable(brokers_top_2020,
      caption = "Top 15 Diplomatic Brokers (2020)",
      col.names = c("Country", "Betweenness Centrality"),
      digits = 5)
Top 15 Diplomatic Brokers (2020)
Country         Betweenness Centrality
Uzbekistan                     0.09200
Czech Republic                 0.06684
United States                  0.05614
Germany                        0.04002
Kazakhstan                     0.03831
Hungary                        0.02848
Indonesia                      0.02737
Ghana                          0.02348
Saudi Arabia                   0.02167
Greece                         0.02069
Cameroon                       0.01999
Turkmenistan                   0.01965
Japan                          0.01905
Australia                      0.01694
Tunisia                        0.01467

Status Evolution Over Time

Let’s track how status has evolved for selected countries:

# Select major powers to track
major_powers <- c("United States", "China", "Russia", "United Kingdom", 
                 "France", "Germany", "Japan", "India", "Brazil")

# Function to extract data for specific countries across years
track_status_evolution <- function(countries, metric) {
  result <- data.frame()
  
  for(year in benchmark_years) {
    year_data <- status_metrics[[as.character(year)]]
    
    # Filter for selected countries
    country_data <- year_data %>%
      filter(country %in% countries) %>%
      select(country, !!sym(metric))
    
    # Add year column
    country_data$year <- year
    
    # Combine with overall result
    result <- rbind(result, country_data)
  }
  
  return(result)
}

# Track PageRank evolution
pagerank_evolution <- track_status_evolution(major_powers, "pagerank")

# Plot evolution
ggplot(pagerank_evolution, aes(x = year, y = pagerank, color = country, group = country)) +
  geom_line(linewidth = 1) +  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_point(size = 3) +
  labs(
    title = "Evolution of Network Status (PageRank) for Major Powers",
    x = "Year",
    y = "PageRank Score",
    color = "Country"
  ) +
  theme_minimal() +
  scale_color_brewer(palette = "Set1")

# Track recognition count evolution
recognition_evolution <- track_status_evolution(major_powers, "recognition_count")

# Plot evolution
ggplot(recognition_evolution, aes(x = year, y = recognition_count, color = country, group = country)) +
  geom_line(linewidth = 1) +  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_point(size = 3) +
  labs(
    title = "Evolution of Diplomatic Recognition Count for Major Powers",
    x = "Year",
    y = "Number of Countries Providing Recognition",
    color = "Country"
  ) +
  theme_minimal() +
  scale_color_brewer(palette = "Set1")

Regional Status Patterns

Let’s examine status metrics by region:

# Get regional status by calculating average metrics for countries in each region
get_regional_metrics <- function(year_val) {
  # Get data for specific year
  year_data <- year_datasets[[as.character(year_val)]]
  year_status <- status_metrics[[as.character(year_val)]]
  
  # Get unique destination countries with their regions
  regional_map <- year_data %>%
    select(Destination, `Destination Region`) %>%
    distinct()
  
  # Join with status metrics
  regional_status <- year_status %>%
    left_join(regional_map, by = c("country" = "Destination"))
  
  # Calculate average by region
  regional_averages <- regional_status %>%
    group_by(`Destination Region`) %>%
    summarize(
      countries = n(),
      avg_recognition_count = mean(recognition_count, na.rm = TRUE),
      avg_weighted_recognition = mean(weighted_recognition, na.rm = TRUE),
      avg_pagerank = mean(pagerank, na.rm = TRUE),
      .groups = "drop"
    )
  
  # Add year column
  regional_averages$year <- year_val
  
  return(regional_averages)
}

# Calculate regional metrics for each benchmark year
regional_metrics <- lapply(benchmark_years, get_regional_metrics)
regional_metrics_df <- do.call(rbind, regional_metrics)

# Plot regional recognition trends
ggplot(regional_metrics_df, 
       aes(x = year, y = avg_recognition_count, color = `Destination Region`, group = `Destination Region`)) +
  geom_line(linewidth = 1) +  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_point(size = 2) +
  labs(
    title = "Average Diplomatic Recognition by Region (1960-2020)",
    x = "Year",
    y = "Average Number of Recognitions",
    color = "Region"
  ) +
  theme_minimal()

Status Inconsistency in Network Metrics

Let’s calculate status inconsistency based on network position:

# Calculate inconsistency metrics for a given year
calculate_inconsistency <- function(metrics) {
  # Standardize key metrics
  metrics_scaled <- metrics %>%
    mutate(
      recognition_z = scale(recognition_count)[,1],
      pagerank_z = scale(pagerank)[,1],
      authority_z = scale(authority)[,1],
      betweenness_z = scale(betweenness)[,1]
    )
  
  # Calculate inconsistency measures
  metrics_scaled %>%
    mutate(
      # Gap between recognition count and prestige (PageRank)
      quantity_quality_gap = abs(recognition_z - pagerank_z),
      
      # Gap between recognition and being recognized by prestigious countries
      prestige_gap = abs(recognition_z - authority_z),
      
      # Gap between recognition and brokerage
      recognition_brokerage_gap = abs(recognition_z - betweenness_z),
      
      # Overall network inconsistency
      network_inconsistency = (quantity_quality_gap + prestige_gap + recognition_brokerage_gap)/3
    )
}

# Calculate inconsistency for 2020
inconsistency_2020 <- calculate_inconsistency(status_metrics[["2020"]])

# Top countries with highest status inconsistency
top_inconsistent_2020 <- inconsistency_2020 %>%
  arrange(desc(network_inconsistency)) %>%
  select(country, quantity_quality_gap, prestige_gap, recognition_brokerage_gap, network_inconsistency) %>%
  head(15)

kable(top_inconsistent_2020,
      caption = "Countries with Highest Network Status Inconsistency (2020)",
      col.names = c("Country", "Quantity-Quality Gap", "Prestige Gap", "Recognition-Brokerage Gap", "Overall Inconsistency"),
      digits = 4)
Countries with Highest Network Status Inconsistency (2020)
Country         Quantity-Quality Gap  Prestige Gap  Recognition-Brokerage Gap  Overall Inconsistency
Uzbekistan                    0.0623        0.1609                     8.4296                 2.8843
Czech Republic                0.1935        0.2551                     5.1475                 1.8654
Belgium                       0.2772        0.9420                     3.2575                 1.4922
United Kingdom                0.3515        0.6249                     2.5303                 1.1689
United States                 0.6817        1.0568                     1.6397                 1.1261
Kazakhstan                    0.0878        0.2725                     2.9417                 1.1007
China                         0.3399        0.6634                     2.2857                 1.0963
France                        0.0047        0.3564                     2.7573                 1.0394
Switzerland                   0.1030        0.3712                     2.5413                 1.0052
India                         0.0607        0.2900                     2.5595                 0.9701
South Africa                  0.0857        0.1524                     2.2619                 0.8333
Turkey                        0.0841        0.0679                     2.2804                 0.8108
Canada                        0.0712        0.1688                     2.1296                 0.7899
Russia                        0.0108        0.2024                     2.0980                 0.7704
Italy                         0.1589        0.1467                     1.9250                 0.7436
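
To make the gap measures concrete, here is a minimal worked example on four invented countries. The values are hypothetical and chosen only so the arithmetic is easy to follow; the standardization and absolute difference follow the same steps as calculate_inconsistency().

```r
# Hypothetical metric values for four countries (not real data)
toy <- data.frame(
  country = c("A", "B", "C", "D"),
  recognition_count = c(10, 20, 30, 40),
  betweenness = c(0.40, 0.10, 0.20, 0.30)
)

# Standardize each metric to mean 0 and standard deviation 1
toy$recognition_z <- as.numeric(scale(toy$recognition_count))
toy$betweenness_z <- as.numeric(scale(toy$betweenness))

# Absolute gap between the two standardized positions
toy$recognition_brokerage_gap <- abs(toy$recognition_z - toy$betweenness_z)

# Rank countries from largest to smallest gap
toy[order(-toy$recognition_brokerage_gap), c("country", "recognition_brokerage_gap")]
```

Country A has the fewest recognitions but the highest betweenness, so it shows the largest gap. Because the overall score averages three such gaps, a single extreme z-score on one metric can dominate it, which appears to be the case for the Recognition-Brokerage Gap column in the 2020 table.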

Data-Driven Composite Status Scores Using PCA

Instead of using arbitrary weights for our composite scores, let’s use Principal Component Analysis (PCA) to derive empirically based weights:

# Function to create PCA-based composite scores
create_pca_status_scores <- function(year_val) {
  # Get the data for the specified year
  year_metrics <- status_metrics[[as.character(year_val)]]
  
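  # Select relevant metrics for different status dimensions
  # 1. Recognition dimension
  # (within one year, recognition_rate is recognition_count divided by a constant,
  #  so the two are perfectly correlated and receive identical loadings)
  recognition_vars <- year_metrics %>%
    select(recognition_count, weighted_recognition, recognition_rate)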
  
  # 2. Prestige dimension
  prestige_vars <- year_metrics %>%
    select(pagerank, authority, eigenvector_centrality)
  
  # 3. Brokerage dimension
  brokerage_vars <- year_metrics %>%
    select(betweenness, recognition_count) # Include recognition as it affects brokerage potential
  
  # Run PCA for each dimension
  # Using prcomp with scaling to account for different units
  recognition_pca <- prcomp(recognition_vars, scale. = TRUE)
  prestige_pca <- prcomp(prestige_vars, scale. = TRUE)
  brokerage_pca <- prcomp(brokerage_vars, scale. = TRUE)
  
  # Extract first principal component for each dimension
  recognition_pc1 <- recognition_pca$x[,1]
  prestige_pc1 <- prestige_pca$x[,1]
  brokerage_pc1 <- brokerage_pca$x[,1]
  
  # Ensure positive orientation (higher values = higher status)
  # Check correlation with a key metric for each dimension
  if(cor(recognition_pc1, year_metrics$recognition_count) < 0) {
    recognition_pc1 <- -recognition_pc1
  }
  if(cor(prestige_pc1, year_metrics$pagerank) < 0) {
    prestige_pc1 <- -prestige_pc1
  }
  if(cor(brokerage_pc1, year_metrics$betweenness) < 0) {
    brokerage_pc1 <- -brokerage_pc1
  }
  
  # Scale to 0-1 range for easier interpretation
  recognition_score <- (recognition_pc1 - min(recognition_pc1)) / (max(recognition_pc1) - min(recognition_pc1))
  prestige_score <- (prestige_pc1 - min(prestige_pc1)) / (max(prestige_pc1) - min(prestige_pc1))
  brokerage_score <- (brokerage_pc1 - min(brokerage_pc1)) / (max(brokerage_pc1) - min(brokerage_pc1))
  
  # Create dataframe with results
  result <- data.frame(
    country = year_metrics$country,
    recognition_status_pca = recognition_score,
    prestige_status_pca = prestige_score,
    brokerage_status_pca = brokerage_score
  )
  
  # Display variance explained by each component
  cat("Year:", year_val, "\n")
  cat("Variance explained by recognition PC1:", summary(recognition_pca)$importance[2,1] * 100, "%\n")
  cat("Variance explained by prestige PC1:", summary(prestige_pca)$importance[2,1] * 100, "%\n")
  cat("Variance explained by brokerage PC1:", summary(brokerage_pca)$importance[2,1] * 100, "%\n\n")
  
  # Display loadings to show how metrics contribute to each component
  cat("Recognition dimension loadings:\n")
  print(recognition_pca$rotation[,1])
  cat("\nPrestige dimension loadings:\n")
  print(prestige_pca$rotation[,1])
  cat("\nBrokerage dimension loadings:\n")
  print(brokerage_pca$rotation[,1])
  cat("\n----------------------------\n\n")
  
  return(result)
}

# Calculate PCA-based scores for 2020
pca_scores_2020 <- create_pca_status_scores(2020)
## Year: 2020 
## Variance explained by recognition PC1: 97.672 %
## Variance explained by prestige PC1: 98.111 %
## Variance explained by brokerage PC1: 64.721 %
## 
## Recognition dimension loadings:
##    recognition_count weighted_recognition     recognition_rate 
##           -0.5808640           -0.5702578           -0.5808640 
## 
## Prestige dimension loadings:
##               pagerank              authority eigenvector_centrality 
##             -0.5717408             -0.5802542             -0.5800151 
## 
## Brokerage dimension loadings:
##       betweenness recognition_count 
##        -0.7071068        -0.7071068 
## 
## ----------------------------
# Display top countries by each PCA-based status dimension
kable(pca_scores_2020 %>% 
        arrange(desc(recognition_status_pca)) %>% 
        select(country, recognition_status_pca) %>% 
        head(15),
      caption = "Top 15 Countries by PCA-Based Recognition Status (2020)",
      col.names = c("Country", "Recognition Status Score"),
      digits = 4)
Top 15 Countries by PCA-Based Recognition Status (2020)
Country         Recognition Status Score
United States                     1.0000
China                             0.9730
United Kingdom                    0.9316
Germany                           0.8968
France                            0.8950
Japan                             0.8685
Belgium                           0.8389
Russia                            0.8314
India                             0.8156
Italy                             0.7581
Turkey                            0.7545
Brazil                            0.7315
Switzerland                       0.7314
Egypt                             0.7208
South Africa                      0.6985
kable(pca_scores_2020 %>% 
        arrange(desc(prestige_status_pca)) %>% 
        select(country, prestige_status_pca) %>% 
        head(15),
      caption = "Top 15 Countries by PCA-Based Prestige Status (2020)",
      col.names = c("Country", "Prestige Status Score"),
      digits = 4)
Top 15 Countries by PCA-Based Prestige Status (2020)
Country         Prestige Status Score
United States                  1.0000
Belgium                        0.9545
China                          0.9352
United Kingdom                 0.9180
Germany                        0.8703
France                         0.8695
India                          0.8573
Japan                          0.8562
Russia                         0.8547
Turkey                         0.7870
South Africa                   0.7867
Italy                          0.7864
Switzerland                    0.7849
Egypt                          0.7655
Brazil                         0.7569
kable(pca_scores_2020 %>% 
        arrange(desc(brokerage_status_pca)) %>% 
        select(country, brokerage_status_pca) %>% 
        head(15),
      caption = "Top 15 Countries by PCA-Based Brokerage Status (2020)",
      col.names = c("Country", "Brokerage Status Score"),
      digits = 4)
Top 15 Countries by PCA-Based Brokerage Status (2020)
Country         Brokerage Status Score
United States                   1.0000
Uzbekistan                      0.9941
Czech Republic                  0.8457
Germany                         0.7761
Japan                           0.5563
Kazakhstan                      0.5212
China                           0.5081
Indonesia                       0.5029
Belgium                         0.4712
Saudi Arabia                    0.4674
United Kingdom                  0.4634
Hungary                         0.4626
Australia                       0.4240
Russia                          0.4204
Greece                          0.4019
# Let's also create an overall status score using all metrics
create_overall_pca_status <- function(year_val) {
  # Get the data for the specified year
  year_metrics <- status_metrics[[as.character(year_val)]]
  
  # Select all relevant metrics
  status_vars <- year_metrics %>%
    select(recognition_count, weighted_recognition, pagerank, 
           authority, eigenvector_centrality, betweenness)
  
  # Run PCA
  overall_pca <- prcomp(status_vars, scale. = TRUE)
  
  # Extract first principal component
  overall_pc1 <- overall_pca$x[,1]
  
  # Ensure positive orientation
  if(cor(overall_pc1, year_metrics$recognition_count) < 0) {
    overall_pc1 <- -overall_pc1
  }
  
  # Scale to 0-1 range
  overall_score <- (overall_pc1 - min(overall_pc1)) / (max(overall_pc1) - min(overall_pc1))
  
  # Create dataframe with results
  result <- data.frame(
    country = year_metrics$country,
    overall_status_network_pca = overall_score
  )
  
  # Display variance explained
  cat("Year:", year_val, "\n")
  cat("Variance explained by overall status PC1:", summary(overall_pca)$importance[2,1] * 100, "%\n")
  
  # Display loadings
  cat("Overall status loadings:\n")
  print(overall_pca$rotation[,1])
  cat("\n----------------------------\n\n")
  
  return(result)
}

# Calculate overall PCA status for 2020
overall_pca_2020 <- create_overall_pca_status(2020)
## Year: 2020 
## Variance explained by overall status PC1: 82.631 %
## Overall status loadings:
##      recognition_count   weighted_recognition               pagerank 
##             -0.4455466             -0.4329208             -0.4405960 
##              authority eigenvector_centrality            betweenness 
##             -0.4437657             -0.4431292             -0.1632527 
## 
## ----------------------------
# Display top countries by overall PCA status
kable(overall_pca_2020 %>% 
        arrange(desc(overall_status_network_pca)) %>% 
        head(20),
      caption = "Top 20 Countries by Overall Network Status (2020)",
      col.names = c("Country", "Overall Network Status Score"),
      digits = 4)
Top 20 Countries by Overall Network Status (2020)
Country             Overall Network Status Score
United States                             1.0000
China                                     0.8968
Germany                                   0.8752
United Kingdom                            0.8626
Japan                                     0.8265
France                                    0.8189
Belgium                                   0.8072
Russia                                    0.7902
India                                     0.7739
Italy                                     0.7216
Turkey                                    0.7180
Brazil                                    0.7020
Egypt                                     0.6923
Switzerland                               0.6918
South Africa                              0.6852
Spain                                     0.6578
Canada                                    0.6550
Korea, Republic of                        0.6241
Saudi Arabia                              0.6152
Netherlands                               0.5992
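
The loadings printed above are all negative, which is why the functions flip the sign of PC1 before rescaling: the sign of a principal component is arbitrary. The sketch below reproduces the orient-then-rescale step on simulated data. The variables metric_a and metric_b are hypothetical stand-ins, not columns from the status metrics.

```r
set.seed(1)  # hypothetical data; the seed is only for reproducibility

# Two correlated toy metrics standing in for, e.g., pagerank and authority
base <- rnorm(50)
toy_vars <- data.frame(
  metric_a = base + rnorm(50, sd = 0.2),
  metric_b = 2 * base + rnorm(50, sd = 0.2)
)

# PCA on standardized variables, as in the functions above
pca <- prcomp(toy_vars, scale. = TRUE)
pc1 <- pca$x[, 1]

# Align the component with a reference metric so that higher means more status
if (cor(pc1, toy_vars$metric_a) < 0) {
  pc1 <- -pc1
}

# Min-max rescaling to the 0-1 range
score <- (pc1 - min(pc1)) / (max(pc1) - min(pc1))

range(score)
cor(score, toy_vars$metric_a) > 0
```

Min-max scores are relative to the sample: the top country always gets 1 and the bottom country 0, so scores from different years are not directly comparable in level.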

Self-Recognition Analysis

Let’s analyze how countries view themselves through their outgoing diplomatic ties:

# Calculate outgoing diplomatic ties for each country (sending country perspective)
calculate_self_recognition <- function(year_val) {
  year_data <- year_datasets[[as.character(year_val)]]
  
  # Calculate outgoing diplomatic metrics
  sending_metrics <- year_data %>%
    group_by(`Sending Country`) %>%
    summarize(
      outgoing_ties = n(),  # Number of countries recognized
      avg_outgoing_lor = mean(LOR, na.rm = TRUE),  # Average LOR given
      total_outgoing_lor = sum(LOR, na.rm = TRUE),  # Total outgoing recognition
      .groups = "drop"
    )
  
  # Get incoming ties for comparison
  receiving_metrics <- year_data %>%
    group_by(Destination) %>%
    summarize(
      incoming_ties = n(),  # Number of recognitions received
      .groups = "drop"
    )
  
  # Join outgoing and incoming
  combined <- sending_metrics %>%
    left_join(receiving_metrics, by = c("Sending Country" = "Destination")) %>%
    # Calculate recognition balance
    mutate(
      recognition_balance = incoming_ties - outgoing_ties,
      recognition_ratio = incoming_ties / outgoing_ties
    )
  
  return(combined)
}

# Calculate for 2020
self_recognition_2020 <- calculate_self_recognition(2020)

# Countries that recognize many others (high self-projection)
high_projectors <- self_recognition_2020 %>%
  arrange(desc(outgoing_ties)) %>%
  select(`Sending Country`, outgoing_ties, incoming_ties, recognition_balance, recognition_ratio) %>%
  head(15)

kable(high_projectors,
      caption = "Countries with Highest Diplomatic Self-Projection (2020)",
      col.names = c("Country", "Outgoing Ties", "Incoming Ties", "Recognition Balance", "Recognition Ratio"),
      digits = 2)
Countries with Highest Diplomatic Self-Projection (2020)
Country             Outgoing Ties  Incoming Ties  Recognition Balance  Recognition Ratio
China                         173            170                   -3               0.98
United States                 171            188                   17               1.10
France                        162            155                   -7               0.96
United Kingdom                162            166                    4               1.02
Germany                       153            160                    7               1.05
Japan                         153            153                    0               1.00
Russia                        144            148                    4               1.03
Turkey                        136            131                   -5               0.96
Brazil                        135            126                   -9               0.93
India                         132            150                   18               1.14
Italy                         127            137                   10               1.08
Egypt                         125            128                    3               1.02
Cuba                          123            100                  -23               0.81
Spain                         119            119                    0               1.00
Korea, Republic of            118            112                   -6               0.95
# Countries with highest recognition deficit (recognize more than recognized)
recognition_deficit <- self_recognition_2020 %>%
  arrange(recognition_balance) %>%
  select(`Sending Country`, outgoing_ties, incoming_ties, recognition_balance, recognition_ratio) %>%
  head(15)

kable(recognition_deficit,
      caption = "Countries with Highest Recognition Deficit (2020)",
      col.names = c("Country", "Outgoing Ties", "Incoming Ties", "Recognition Balance", "Recognition Ratio"),
      digits = 2)
Countries with Highest Recognition Deficit (2020)
Country                                 Outgoing Ties  Incoming Ties  Recognition Balance  Recognition Ratio
Libya                                             101             67                  -34               0.66
Korea, Democratic People’s Republic of             54             23                  -31               0.43
Venezuela                                          91             63                  -28               0.69
Holy See (Vatican)                                112             85                  -27               0.76
Iraq                                               69             44                  -25               0.64
Cuba                                              123            100                  -23               0.81
Georgia                                            59             37                  -22               0.63
Slovakia                                           63             41                  -22               0.65
Sudan                                              72             56                  -16               0.78
Somalia                                            33             19                  -14               0.58
Eritrea                                            31             18                  -13               0.58
El Salvador                                        40             28                  -12               0.70
Bangladesh                                         57             46                  -11               0.81
Ecuador                                            47             36                  -11               0.77
Equatorial Guinea                                  39             28                  -11               0.72
# Countries with the largest recognition surplus (they receive more ties than they send)
recognition_surplus <- self_recognition_2020 %>%
  arrange(desc(recognition_balance)) %>%
  select(`Sending Country`, outgoing_ties, incoming_ties, recognition_balance, recognition_ratio) %>%
  head(15)

kable(recognition_surplus,
      caption = "Countries with Highest Recognition Surplus (2020)",
      col.names = c("Country", "Outgoing Ties", "Incoming Ties", "Recognition Balance", "Recognition Ratio"),
      digits = 2)
Countries with Highest Recognition Surplus (2020)

| Country | Outgoing Ties | Incoming Ties | Recognition Balance | Recognition Ratio |
|:--|--:|--:|--:|--:|
| Belgium | 83 | 183 | 100 | 2.20 |
| Ethiopia | 47 | 107 | 60 | 2.28 |
| Singapore | 27 | 73 | 46 | 2.70 |
| Switzerland | 103 | 143 | 40 | 1.39 |
| Kenya | 48 | 87 | 39 | 1.81 |
| Austria | 84 | 118 | 34 | 1.40 |
| Canada | 98 | 130 | 32 | 1.33 |
| South Africa | 106 | 133 | 27 | 1.25 |
| Australia | 83 | 107 | 24 | 1.29 |
| Senegal | 51 | 75 | 24 | 1.47 |
| Tanzania | 36 | 58 | 22 | 1.61 |
| Malaysia | 79 | 100 | 21 | 1.27 |
| India | 132 | 150 | 18 | 1.14 |
| Mozambique | 31 | 49 | 18 | 1.58 |
| Trinidad and Tobago | 13 | 30 | 17 | 2.31 |
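
To make the two summary columns concrete, here is a minimal sketch on a toy edge list. It assumes, consistent with the figures in the tables above, that recognition balance is incoming minus outgoing ties and that recognition ratio is incoming divided by outgoing ties. The `edges` data frame, its `sender` and `destination` columns, and the country labels are hypothetical and are not part of the Diplometrics data.

```r
library(dplyr)

# Hypothetical toy edge list: one row per sender -> destination tie
edges <- data.frame(
  sender      = c("A", "A", "A", "B", "C", "C", "D"),
  destination = c("B", "C", "E", "A", "A", "B", "A"),
  stringsAsFactors = FALSE
)

# Ties sent by each country
outgoing <- edges %>%
  count(sender, name = "outgoing_ties") %>%
  rename(country = sender)

# Ties received by each country
incoming <- edges %>%
  count(destination, name = "incoming_ties") %>%
  rename(country = destination)

# A full join keeps countries that only send or only receive
balance <- full_join(outgoing, incoming, by = "country") %>%
  mutate(
    outgoing_ties       = coalesce(outgoing_ties, 0L),
    incoming_ties       = coalesce(incoming_ties, 0L),
    recognition_balance = incoming_ties - outgoing_ties,
    # The ratio is undefined when a country sends no ties
    recognition_ratio   = if_else(outgoing_ties > 0,
                                  incoming_ties / outgoing_ties,
                                  NA_real_)
  ) %>%
  arrange(recognition_balance)

print(balance)
```

In this toy network, country E receives one tie but sends none, so its ratio is left as `NA` rather than `Inf`. Sorting ascending by balance puts deficit countries first, and sorting descending puts surplus countries first.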

Preparing Network Status Variables for Integration Using PCA

Let’s prepare the PCA-based network status variables for integration with the main dataset:

# Calculate PCA-based status scores for all benchmark years
pca_scores <- list()
overall_pca_scores <- list()

# Calculate for each year
for(year in benchmark_years) {
  pca_scores[[as.character(year)]] <- create_pca_status_scores(year)
  overall_pca_scores[[as.character(year)]] <- create_overall_pca_status(year)
}
## Year: 1960 
## Variance explained by recognition PC1: 97.426 %
## Variance explained by prestige PC1: 97.467 %
## Variance explained by brokerage PC1: 72.576 %
## 
## Recognition dimension loadings:
##    recognition_count weighted_recognition     recognition_rate 
##           -0.5812540           -0.5694625           -0.5812540 
## 
## Prestige dimension loadings:
##               pagerank              authority eigenvector_centrality 
##             -0.5698288             -0.5800420             -0.5821052 
## 
## Brokerage dimension loadings:
##       betweenness recognition_count 
##        -0.7071068        -0.7071068 
## 
## ----------------------------
## 
## Year: 1960 
## Variance explained by overall status PC1: 84.435 %
## Overall status loadings:
##      recognition_count   weighted_recognition               pagerank 
##             -0.4399850             -0.4268917             -0.4338473 
##              authority eigenvector_centrality            betweenness 
##             -0.4335300             -0.4358433             -0.2409265 
## 
## ----------------------------
## 
## Year: 1965 
## Variance explained by recognition PC1: 96.857 %
## Variance explained by prestige PC1: 96.41 %
## Variance explained by brokerage PC1: 60.746 %
## 
## Recognition dimension loadings:
##    recognition_count weighted_recognition     recognition_rate 
##           -0.5821709           -0.5675861           -0.5821709 
## 
## Prestige dimension loadings:
##               pagerank              authority eigenvector_centrality 
##             -0.5665539             -0.5807124             -0.5846279 
## 
## Brokerage dimension loadings:
##       betweenness recognition_count 
##        -0.7071068        -0.7071068 
## 
## ----------------------------
## 
## Year: 1965 
## Variance explained by overall status PC1: 80.25 %
## Overall status loadings:
##      recognition_count   weighted_recognition               pagerank 
##             -0.4532097             -0.4317983             -0.4416851 
##              authority eigenvector_centrality            betweenness 
##             -0.4439766             -0.4474075             -0.1256056 
## 
## ----------------------------
## 
## Year: 1970 
## Variance explained by recognition PC1: 96.666 %
## Variance explained by prestige PC1: 96.815 %
## Variance explained by brokerage PC1: 73.801 %
## 
## Recognition dimension loadings:
##    recognition_count weighted_recognition     recognition_rate 
##           -0.5824830           -0.5669454           -0.5824830 
## 
## Prestige dimension loadings:
##               pagerank              authority eigenvector_centrality 
##             -0.5680435             -0.5797003             -0.5841867 
## 
## Brokerage dimension loadings:
##       betweenness recognition_count 
##         0.7071068         0.7071068 
## 
## ----------------------------
## 
## Year: 1970 
## Variance explained by overall status PC1: 83.218 %
## Overall status loadings:
##      recognition_count   weighted_recognition               pagerank 
##             -0.4414904             -0.4213380             -0.4366745 
##              authority eigenvector_centrality            betweenness 
##             -0.4303423             -0.4357793             -0.2485515 
## 
## ----------------------------
## 
## Year: 1975 
## Variance explained by recognition PC1: 97.133 %
## Variance explained by prestige PC1: 97.657 %
## Variance explained by brokerage PC1: 72.692 %
## 
## Recognition dimension loadings:
##    recognition_count weighted_recognition     recognition_rate 
##           -0.5817240           -0.5685018           -0.5817240 
## 
## Prestige dimension loadings:
##               pagerank              authority eigenvector_centrality 
##             -0.5707521             -0.5787696             -0.5824669 
## 
## Brokerage dimension loadings:
##       betweenness recognition_count 
##        -0.7071068        -0.7071068 
## 
## ----------------------------
## 
## Year: 1975 
## Variance explained by overall status PC1: 83.352 %
## Overall status loadings:
##      recognition_count   weighted_recognition               pagerank 
##             -0.4414057             -0.4242693             -0.4374834 
##              authority eigenvector_centrality            betweenness 
##             -0.4322595             -0.4367541             -0.2369860 
## 
## ----------------------------
## 
## Year: 1980 
## Variance explained by recognition PC1: 96.604 %
## Variance explained by prestige PC1: 96.64 %
## Variance explained by brokerage PC1: 65.879 %
## 
## Recognition dimension loadings:
##    recognition_count weighted_recognition     recognition_rate 
##           -0.5825847           -0.5667364           -0.5825847 
## 
## Prestige dimension loadings:
##               pagerank              authority eigenvector_centrality 
##             -0.5675220             -0.5799258             -0.5844697 
## 
## Brokerage dimension loadings:
##       betweenness recognition_count 
##        -0.7071068        -0.7071068 
## 
## ----------------------------
## 
## Year: 1980 
## Variance explained by overall status PC1: 80.994 %
## Overall status loadings:
##      recognition_count   weighted_recognition               pagerank 
##             -0.4442745             -0.4301658             -0.4415678 
##              authority eigenvector_centrality            betweenness 
##             -0.4376899             -0.4434829             -0.1853265 
## 
## ----------------------------
## 
## Year: 1985 
## Variance explained by recognition PC1: 97.066 %
## Variance explained by prestige PC1: 96.821 %
## Variance explained by brokerage PC1: 67.116 %
## 
## Recognition dimension loadings:
##    recognition_count weighted_recognition     recognition_rate 
##           -0.5818316           -0.5682816           -0.5818316 
## 
## Prestige dimension loadings:
##               pagerank              authority eigenvector_centrality 
##             -0.5677712             -0.5805062             -0.5836509 
## 
## Brokerage dimension loadings:
##       betweenness recognition_count 
##         0.7071068         0.7071068 
## 
## ----------------------------
## 
## Year: 1985 
## Variance explained by overall status PC1: 82.281 %
## Overall status loadings:
##      recognition_count   weighted_recognition               pagerank 
##             -0.4460435             -0.4305488             -0.4354026 
##              authority eigenvector_centrality            betweenness 
##             -0.4400504             -0.4426168             -0.1911638 
## 
## ----------------------------
## 
## Year: 1990 
## Variance explained by recognition PC1: 97.461 %
## Variance explained by prestige PC1: 96.815 %
## Variance explained by brokerage PC1: 64.722 %
## 
## Recognition dimension loadings:
##    recognition_count weighted_recognition     recognition_rate 
##           -0.5811977           -0.5695774           -0.5811977 
## 
## Prestige dimension loadings:
##               pagerank              authority eigenvector_centrality 
##             -0.5677063             -0.5807112             -0.5835101 
## 
## Brokerage dimension loadings:
##       betweenness recognition_count 
##         0.7071068         0.7071068 
## 
## ----------------------------
## 
## Year: 1990 
## Variance explained by overall status PC1: 81.831 %
## Overall status loadings:
##      recognition_count   weighted_recognition               pagerank 
##             -0.4475435             -0.4337214             -0.4368558 
##              authority eigenvector_centrality            betweenness 
##             -0.4418720             -0.4441081             -0.1681210 
## 
## ----------------------------
## 
## Year: 1995 
## Variance explained by recognition PC1: 97.472 %
## Variance explained by prestige PC1: 96.784 %
## Variance explained by brokerage PC1: 62.918 %
## 
## Recognition dimension loadings:
##    recognition_count weighted_recognition     recognition_rate 
##           -0.5811814           -0.5696107           -0.5811814 
## 
## Prestige dimension loadings:
##               pagerank              authority eigenvector_centrality 
##             -0.5675619             -0.5809173             -0.5834455 
## 
## Brokerage dimension loadings:
##       betweenness recognition_count 
##        -0.7071068        -0.7071068 
## 
## ----------------------------
## 
## Year: 1995 
## Variance explained by overall status PC1: 81.526 %
## Overall status loadings:
##      recognition_count   weighted_recognition               pagerank 
##             -0.4483286             -0.4352306             -0.4369596 
##              authority eigenvector_centrality            betweenness 
##             -0.4437476             -0.4458243             -0.1515613 
## 
## ----------------------------
## 
## Year: 2000 
## Variance explained by recognition PC1: 97.614 %
## Variance explained by prestige PC1: 97.165 %
## Variance explained by brokerage PC1: 65.564 %
## 
## Recognition dimension loadings:
##    recognition_count weighted_recognition     recognition_rate 
##           -0.5809550           -0.5700725           -0.5809550 
## 
## Prestige dimension loadings:
##               pagerank              authority eigenvector_centrality 
##             -0.5687447             -0.5808227             -0.5823869 
## 
## Brokerage dimension loadings:
##       betweenness recognition_count 
##        -0.7071068        -0.7071068 
## 
## ----------------------------
## 
## Year: 2000 
## Variance explained by overall status PC1: 82.256 %
## Overall status loadings:
##      recognition_count   weighted_recognition               pagerank 
##             -0.4461475             -0.4348858             -0.4387163 
##              authority eigenvector_centrality            betweenness 
##             -0.4410802             -0.4423568             -0.1706560 
## 
## ----------------------------
## 
## Year: 2005 
## Variance explained by recognition PC1: 97.518 %
## Variance explained by prestige PC1: 97.487 %
## Variance explained by brokerage PC1: 66.278 %
## 
## Recognition dimension loadings:
##    recognition_count weighted_recognition     recognition_rate 
##           -0.5811082           -0.5697600           -0.5811082 
## 
## Prestige dimension loadings:
##               pagerank              authority eigenvector_centrality 
##             -0.5697438             -0.5807636             -0.5814685 
## 
## Brokerage dimension loadings:
##       betweenness recognition_count 
##         0.7071068         0.7071068 
## 
## ----------------------------
## 
## Year: 2005 
## Variance explained by overall status PC1: 82.438 %
## Overall status loadings:
##      recognition_count   weighted_recognition               pagerank 
##             -0.4462848             -0.4333804             -0.4396114 
##              authority eigenvector_centrality            betweenness 
##             -0.4415080             -0.4419477             -0.1717734 
## 
## ----------------------------
## 
## Year: 2010 
## Variance explained by recognition PC1: 97.664 %
## Variance explained by prestige PC1: 97.86 %
## Variance explained by brokerage PC1: 64.344 %
## 
## Recognition dimension loadings:
##    recognition_count weighted_recognition     recognition_rate 
##           -0.5808763           -0.5702328           -0.5808763 
## 
## Prestige dimension loadings:
##               pagerank              authority eigenvector_centrality 
##             -0.5709077             -0.5806520             -0.5804375 
## 
## Brokerage dimension loadings:
##       betweenness recognition_count 
##        -0.7071068        -0.7071068 
## 
## ----------------------------
## 
## Year: 2010 
## Variance explained by overall status PC1: 82.358 %
## Overall status loadings:
##      recognition_count   weighted_recognition               pagerank 
##             -0.4461710             -0.4344737             -0.4409849 
##              authority eigenvector_centrality            betweenness 
##             -0.4437519             -0.4432670             -0.1558688 
## 
## ----------------------------
## 
## Year: 2015 
## Variance explained by recognition PC1: 97.736 %
## Variance explained by prestige PC1: 98.014 %
## Variance explained by brokerage PC1: 64.07 %
## 
## Recognition dimension loadings:
##    recognition_count weighted_recognition     recognition_rate 
##           -0.5807629           -0.5704637           -0.5807629 
## 
## Prestige dimension loadings:
##               pagerank              authority eigenvector_centrality 
##             -0.5714183             -0.5804621             -0.5801249 
## 
## Brokerage dimension loadings:
##       betweenness recognition_count 
##        -0.7071068        -0.7071068 
## 
## ----------------------------
## 
## Year: 2015 
## Variance explained by overall status PC1: 82.41 %
## Overall status loadings:
##      recognition_count   weighted_recognition               pagerank 
##             -0.4462810             -0.4342422             -0.4412979 
##              authority eigenvector_centrality            betweenness 
##             -0.4442916             -0.4435678             -0.1528914 
## 
## ----------------------------
## 
## Year: 2020 
## Variance explained by recognition PC1: 97.672 %
## Variance explained by prestige PC1: 98.111 %
## Variance explained by brokerage PC1: 64.721 %
## 
## Recognition dimension loadings:
##    recognition_count weighted_recognition     recognition_rate 
##           -0.5808640           -0.5702578           -0.5808640 
## 
## Prestige dimension loadings:
##               pagerank              authority eigenvector_centrality 
##             -0.5717408             -0.5802542             -0.5800151 
## 
## Brokerage dimension loadings:
##       betweenness recognition_count 
##        -0.7071068        -0.7071068 
## 
## ----------------------------
## 
## Year: 2020 
## Variance explained by overall status PC1: 82.631 %
## Overall status loadings:
##      recognition_count   weighted_recognition               pagerank 
##             -0.4455466             -0.4329208             -0.4405960 
##              authority eigenvector_centrality            betweenness 
##             -0.4437657             -0.4431292             -0.1632527 
## 
## ----------------------------
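
Two features of the output above are worth keeping in mind when reading the loadings. First, the sign of a principal component is arbitrary: the brokerage loadings switch between -0.7071068 and 0.7071068 across years without any change in meaning, and a PCA on two standardized variables always gives PC1 loadings of magnitude 1/sqrt(2), so only the variance explained is informative there. Second, `recognition_count` and `recognition_rate` have identical loadings in every year, which is what we would expect if the rate is a linear rescaling of the count. The sketch below illustrates both points on simulated data. The data, the assumed definition of the rate (count divided by the number of other countries), and the orientation rule are illustrative assumptions, not the definitions used by `create_pca_status_scores()`.

```r
set.seed(1)

# Hypothetical toy data for 30 "countries"
n <- 30
recognition_count    <- rpois(n, lambda = 40)
recognition_rate     <- recognition_count / (n - 1)   # assumed definition
weighted_recognition <- 0.8 * recognition_count + rnorm(n, sd = 3)

X   <- cbind(recognition_count, weighted_recognition, recognition_rate)
pca <- prcomp(X, center = TRUE, scale. = TRUE)

loadings_pc1 <- pca$rotation[, 1]
scores_pc1   <- pca$x[, 1]

# Orient PC1 so that higher scores mean more recognition:
# flip loadings and scores together if the loadings sum to a negative value
if (sum(loadings_pc1) < 0) {
  loadings_pc1 <- -loadings_pc1
  scores_pc1   <- -scores_pc1
}

# Share of variance captured by PC1, in percent
variance_pc1 <- 100 * pca$sdev[1]^2 / sum(pca$sdev^2)

print(round(loadings_pc1, 4))
print(round(variance_pc1, 3))
```

Because the count and the rate carry the same information after standardization, they receive the same loading and push the variance explained by PC1 upward. Flipping loadings and scores together leaves the fit unchanged and only fixes the direction of the scale.
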
# Function to prepare status variables with PCA scores
prepare_status_variables_pca <- function(year_val) {
  # Network metrics
  network_metrics <- status_metrics[[as.character(year_val)]]
  
  # Calculate inconsistency
  inconsistency <- calculate_inconsistency(network_metrics)
  
  # Self-recognition metrics
  self_rec <- calculate_self_recognition(year_val)
  
  # Get PCA scores
  pca_year <- pca_scores[[as.character(year_val)]]
  overall_pca_year <- overall_pca_scores[[as.character(year_val)]]
  
  # Join metrics
  combined <- network_metrics %>%
    # Join inconsistency metrics
    left_join(
      inconsistency %>% select(country, network_inconsistency),
      by = "country"
    ) %>%
    # Join self-recognition metrics
    left_join(
      self_rec %>% select(`Sending Country`, outgoing_ties, recognition_balance, recognition_ratio),
      by = c("country" = "Sending Country")
    ) %>%
    # Join PCA scores
    left_join(pca_year, by = "country") %>%
    left_join(overall_pca_year, by = "country") %>%
    # Add year and the external-internal recognition ratio
    # (objective vs subjective status; NA when a country sends no ties)
    mutate(
      year = year_val,
      external_internal_ratio = if_else(outgoing_ties > 0, recognition_count / outgoing_ties, NA_real_)
    )
  
  return(combined)
}

# Prepare variables for each benchmark year using PCA scores
network_status_variables_pca <- lapply(benchmark_years, prepare_status_variables_pca)
names(network_status_variables_pca) <- as.character(benchmark_years)

# Create a combined dataframe
all_network_status_pca <- do.call(rbind, network_status_variables_pca)

# Preview the final dataset with PCA-based scores
head(all_network_status_pca %>% 
     select(country, year, recognition_count, 
            recognition_status_pca, prestige_status_pca, 
            brokerage_status_pca, overall_status_network_pca))
##               country year recognition_count recognition_status_pca
## 1960.1 Czechoslovakia 1960                39              0.5300235
## 1960.2          Egypt 1960                58              0.7343870
## 1960.3         France 1960                82              1.0000000
## 1960.4        Germany 1960                66              0.8190107
## 1960.5          India 1960                55              0.6415824
## 1960.6      Indonesia 1960                33              0.4061870
##        prestige_status_pca brokerage_status_pca overall_status_network_pca
## 1960.1           0.4395712            0.3878857                  0.5032924
## 1960.2           0.6257682            0.3205850                  0.6427105
## 1960.3           1.0000000            0.6361345                  1.0000000
## 1960.4           0.7860529            1.0000000                  0.9148921
## 1960.5           0.6041636            0.6047510                  0.6562076
## 1960.6           0.4038194            0.5157629                  0.4653441
# Export the dataset; it can now be joined with fully_integrated_data
# using country name and year as keys
write.csv(all_network_status_pca, file = "all_network_status_pca.csv", row.names = FALSE)
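
The join mentioned in the comment above could look like the following sketch. Both data frames here are small hypothetical stand-ins with made-up values; the `gdp` column and the spelling "South Korea" in the main dataset are assumptions chosen to show how a naming mismatch would surface.

```r
library(dplyr)

# Hypothetical stand-in for the exported network status data (made-up values)
network_status <- data.frame(
  country = c("France", "Germany", "Korea, Republic of"),
  year    = c(1960, 1960, 1960),
  overall_status_network_pca = c(0.9, 0.8, 0.3),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the main dataset (made-up values)
main_data <- data.frame(
  country = c("France", "Germany", "South Korea"),
  year    = c(1960, 1960, 1960),
  gdp     = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Keep every row of the main dataset and attach network scores where keys match
merged <- left_join(main_data, network_status, by = c("country", "year"))

# Network rows with no partner in the main dataset, usually a naming mismatch
unmatched <- anti_join(network_status, main_data, by = c("country", "year"))

print(merged)
print(unmatched)
```

Checking `unmatched` before analysis is a cheap way to catch country names that are spelled differently in the two sources, since a left join silently fills such rows with `NA`.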

Conclusion

This network analysis of diplomatic representation yields four groups of status measures for each country and benchmark year:

  1. External Recognition Metrics:
    • Recognition count and weighted recognition capture the quantity of recognition
    • PageRank, authority scores, and eigenvector centrality measure prestige-weighted recognition
    • Betweenness centrality captures brokerage status
  2. Self-Recognition Metrics:
    • Outgoing ties measure how widely a country projects itself diplomatically
    • Recognition balance (incoming minus outgoing ties) and recognition ratio (incoming divided by outgoing ties) reveal discrepancies between given and received recognition
  3. Status Inconsistency Measures:
    • Network inconsistency identifies mismatches between different forms of recognition
    • External-internal ratio reveals objective vs. subjective status gaps
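  4. Composite Status Scores (first principal components):
    • Recognition status combines recognition count, weighted recognition, and recognition rate
    • Prestige status combines PageRank, authority, and eigenvector centrality
    • Brokerage status combines betweenness and recognition count
    • Overall network status combines recognition count, weighted recognition, the three prestige measures, and betweenness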

These network metrics can be integrated with other status dimensions (material capabilities, institutional integration) to create a comprehensive understanding of international status.